From: eb
Subject: [Commit-gnuradio] r7921 - in gnuradio/branches/developers/eb/gcell-wip: . src/apps src/apps/spu src/lib src/lib/spu src/spu-include
Date: Tue, 4 Mar 2008 08:51:02 -0700 (MST)
Author: eb
Date: 2008-03-04 08:51:02 -0700 (Tue, 04 Mar 2008)
New Revision: 7921
Added:
gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/
gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/Makefile.am
gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/gcell_qa.c
gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_main.c
gnuradio/branches/developers/eb/gcell-wip/src/spu-include/gc_delay.h
Removed:
gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_delay.h
gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_spu_procs.c
gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gcell_spu_main.c
Modified:
gnuradio/branches/developers/eb/gcell-wip/configure.ac
gnuradio/branches/developers/eb/gcell-wip/src/apps/Makefile.am
gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager.h
gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager_impl.cc
gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/Makefile.am
Log:
gcell work-in-progress
Modified: gnuradio/branches/developers/eb/gcell-wip/configure.ac
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/configure.ac 2008-03-04 15:33:43 UTC (rev 7920)
+++ gnuradio/branches/developers/eb/gcell-wip/configure.ac 2008-03-04 15:51:02 UTC (rev 7921)
@@ -206,6 +206,7 @@
config/Makefile \
src/Makefile \
src/apps/Makefile \
+ src/apps/spu/Makefile \
src/include/Makefile \
src/lib/Makefile \
src/lib/spu/Makefile \
Modified: gnuradio/branches/developers/eb/gcell-wip/src/apps/Makefile.am
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/apps/Makefile.am 2008-03-04 15:33:43 UTC (rev 7920)
+++ gnuradio/branches/developers/eb/gcell-wip/src/apps/Makefile.am 2008-03-04 15:51:02 UTC (rev 7921)
@@ -1,5 +1,5 @@
#
-# Copyright 2007 Free Software Foundation, Inc.
+# Copyright 2007,2008 Free Software Foundation, Inc.
#
# This file is part of GNU Radio
#
@@ -20,6 +20,8 @@
include $(top_srcdir)/Makefile.common
+SUBDIRS = spu .
+
INCLUDES = $(STD_DEFINES_AND_INCLUDES) $(CPPUNIT_INCLUDES)
# list of programs run by "make check" and "make distcheck"
@@ -40,7 +42,7 @@
test_all_SOURCES = test_all.cc
benchmark_dma_SOURCES = benchmark_dma.cc
-benchmark_dma_LDADD = $(STDLIBS) -lmblock
+benchmark_dma_LDADD = spu/gcell_qa $(STDLIBS) -lmblock
benchmark_nop_SOURCES = benchmark_nop.cc
-benchmark_nop_LDADD = $(STDLIBS) -lmblock
+benchmark_nop_LDADD = spu/gcell_qa $(STDLIBS) -lmblock
Property changes on: gnuradio/branches/developers/eb/gcell-wip/src/apps/spu
___________________________________________________________________
Name: svn:ignore
+ Makefile
Makefile.in
*.a
*.la
*.lo
.deps
.libs
test_spu
gcell_spu_main
gcell_qa
Added: gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/Makefile.am
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/Makefile.am (rev 0)
+++ gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/Makefile.am 2008-03-04 15:51:02 UTC (rev 7921)
@@ -0,0 +1,32 @@
+#
+# Copyright 2008 Free Software Foundation, Inc.
+#
+# This file is part of GNU Radio
+#
+# GNU Radio is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 3, or (at your option)
+# any later version.
+#
+# GNU Radio is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License along
+# with this program; if not, write to the Free Software Foundation, Inc.,
+# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+#
+
+
+include $(top_srcdir)/Makefile.common.spu
+
+AM_CPPFLAGS = $(SPU_DEFINES_AND_INCLUDES) $(IBM_SPU_SYNC_INCLUDES)
+
+# SPU executables
+
+
+LDADD = ../../lib/spu/libgcell_spu.a
+
+noinst_PROGRAMS = \
+ gcell_qa
Property changes on: gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/Makefile.am
___________________________________________________________________
Name: svn:eol-style
+ native
Copied: gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/gcell_qa.c (from rev 7919, gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_spu_procs.c)
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/gcell_qa.c (rev 0)
+++ gnuradio/branches/developers/eb/gcell-wip/src/apps/spu/gcell_qa.c 2008-03-04 15:51:02 UTC (rev 7921)
@@ -0,0 +1,129 @@
+/* -*- c++ -*- */
+/*
+ * Copyright 2008 Free Software Foundation, Inc.
+ *
+ * This file is part of GNU Radio
+ *
+ * GNU Radio is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 3, or (at your option)
+ * any later version.
+ *
+ * GNU Radio is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <gc_delay.h>
+#include <gc_declare_proc.h>
+#include <string.h>
+
+
+#define _UNUSED __attribute__((unused))
+
+// FIXME move these out of here; only for QA usage
+
+static void
+qa_nop(const gc_job_direct_args_t *input _UNUSED,
+ gc_job_direct_args_t *output _UNUSED,
+ const gc_job_ea_args_t *eaa _UNUSED)
+{
+}
+
+
+static void
+qa_udelay(const gc_job_direct_args_t *input,
+ gc_job_direct_args_t *output _UNUSED,
+ const gc_job_ea_args_t *eaa _UNUSED)
+{
+ gc_udelay(input->arg[0].u32);
+}
+
+static int
+sum_shorts(short *p, int nshorts)
+{
+ int total = 0;
+ for (int i = 0; i < nshorts; i++)
+ total += p[i];
+
+ return total;
+}
+
+static void
+qa_sum_shorts(const gc_job_direct_args_t *input _UNUSED,
+ gc_job_direct_args_t *output,
+ const gc_job_ea_args_t *eaa)
+{
+ for (unsigned int i = 0; i < eaa->nargs; i++){
+ short *p = eaa->arg[i].ls_addr;
+ int n = eaa->arg[i].get_size / sizeof(short);
+ output->arg[i].s32 = sum_shorts(p, n);
+ //printf("qa_sum_shorts(%p, %d) = %d\n", p, n, output->arg[i].s32);
+ }
+}
+
+static void
+write_seq(unsigned char *p, int nbytes, int counter)
+{
+ for (int i = 0; i < nbytes; i++)
+ p[i] = counter++;
+}
+
+static void
+qa_put_seq(const gc_job_direct_args_t *input,
+ gc_job_direct_args_t *output _UNUSED,
+ const gc_job_ea_args_t *eaa)
+{
+ int counter = input->arg[0].s32;
+
+ for (unsigned int i = 0; i < eaa->nargs; i++){
+ unsigned char *p = eaa->arg[i].ls_addr;
+ int n = eaa->arg[i].put_size;
+ write_seq(p, n, counter);
+ counter += n;
+ }
+}
+
+static void
+qa_put_zeros(const gc_job_direct_args_t *input _UNUSED,
+ gc_job_direct_args_t *output _UNUSED,
+ const gc_job_ea_args_t *eaa)
+{
+ for (unsigned int i = 0; i < eaa->nargs; i++){
+ if (eaa->arg[i].direction == GCJD_DMA_PUT)
+ memset(eaa->arg[i].ls_addr, 0, eaa->arg[i].put_size);
+ }
+}
+
+static void
+qa_copy(const gc_job_direct_args_t *input _UNUSED,
+ gc_job_direct_args_t *output,
+ const gc_job_ea_args_t *eaa)
+{
+ if (eaa->nargs != 2
+ || eaa->arg[0].direction != GCJD_DMA_PUT
+ || eaa->arg[1].direction != GCJD_DMA_GET){
+ output->arg[0].s32 = -1;
+ return;
+ }
+
+ output->arg[0].s32 = 0;
+ unsigned n = eaa->arg[0].put_size;
+ if (eaa->arg[1].get_size < n)
+ n = eaa->arg[1].get_size;
+
+ memcpy(eaa->arg[0].ls_addr, eaa->arg[1].ls_addr, n);
+}
+
+GC_DECLARE_PROC(qa_nop, "qa_nop");
+GC_DECLARE_PROC(qa_udelay, "qa_udelay");
+GC_DECLARE_PROC(qa_sum_shorts, "qa_sum_shorts");
+GC_DECLARE_PROC(qa_put_seq, "qa_put_seq");
+GC_DECLARE_PROC(qa_put_zeros, "qa_put_zeros");
+GC_DECLARE_PROC(qa_copy, "qa_copy");
+
Modified: gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager.h
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager.h 2008-03-04 15:33:43 UTC (rev 7920)
+++ gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager.h 2008-03-04 15:51:02 UTC (rev 7921)
@@ -25,6 +25,7 @@
#include <boost/utility.hpp>
#include <vector>
#include <string>
+#include <libspe2.h>
#include "gc_job_desc.h"
class gc_job_manager;
@@ -44,10 +45,12 @@
unsigned int nspes; // how many SPEs shall we use? 0 -> all of them
bool gang_schedule; // shall we gang schedule?
bool use_affinity; // shall we try for affinity (FIXME not implemented)
+ spe_program_handle_t *program_handle; // program to load into SPEs
gc_jm_options() :
max_jobs(0), max_client_threads(0), nspes(0),
- gang_schedule(true), use_affinity(false)
+ gang_schedule(true), use_affinity(false),
+ program_handle(0)
{
}
};
Modified: gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager_impl.cc
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager_impl.cc 2008-03-04 15:33:43 UTC (rev 7920)
+++ gnuradio/branches/developers/eb/gcell-wip/src/lib/gc_job_manager_impl.cc 2008-03-04 15:51:02 UTC (rev 7921)
@@ -72,6 +72,7 @@
}
};
+
// custom deleter of anything that can be freed with "free"
class free_deleter {
public:
@@ -137,6 +138,11 @@
if (d_options.max_client_threads == 0)
d_options.max_client_threads = DEFAULT_MAX_CLIENT_THREADS;
+ if (d_options.program_handle == 0){
+ fprintf(stderr, "gc_job_manager: options->program_handle must be non-zero");
+ throw std::runtime_error("gc_job_manager: options->program_handle must be non-zero");
+ }
+
int ncpu_nodes = spe_cpu_info_get(SPE_COUNT_PHYSICAL_CPU_NODES, -1);
int nusable_spes = spe_cpu_info_get(SPE_COUNT_USABLE_SPES, -1);
@@ -217,8 +223,9 @@
// get a handle to the spe program
+#if 0
// FIXME pass this in (or something)
- const char *spu_progname = "../lib/spu/gcell_qa";
+ const char *spu_progname = "../apps/spu/gcell_qa";
spe_program_handle_t *spe_image = spe_image_open(spu_progname);
if (spe_image == 0){
@@ -229,9 +236,15 @@
}
d_spe_image = spe_program_handle_sptr(spe_image, spe_program_handle_deleter());
+#else
+
+ spe_program_handle_t *spe_image = d_options.program_handle;
+
+#endif
+
// fish proc_def table out of SPE ELF file
- if (!gcpd_find_table(d_spe_image.get(), &d_proc_def, &d_nproc_defs, &d_proc_def_ls_addr)){
+ if (!gcpd_find_table(spe_image, &d_proc_def, &d_nproc_defs, &d_proc_def_ls_addr)){
fprintf(stderr, "gc_job_manager_impl: couldn't find gc_proc_defs in SPE ELF file.\n");
throw std::runtime_error("no gc_proc_defs");
}
Modified: gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/Makefile.am
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/Makefile.am 2008-03-04 15:33:43 UTC (rev 7920)
+++ gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/Makefile.am 2008-03-04 15:51:02 UTC (rev 7921)
@@ -22,7 +22,7 @@
AM_CPPFLAGS = $(SPU_DEFINES_AND_INCLUDES) $(IBM_SPU_SYNC_INCLUDES)
-# libraray of SPU code
+# library of SPU code
noinst_LIBRARIES = \
libgcell_spu.a
@@ -31,14 +31,4 @@
gc_delay.c \
gc_spu_jd_queue.c \
spu_buffers.c \
- gcell_spu_main.c
-
-
-# SPU executables
-
-LDADD = libgcell_spu.a
-
-noinst_PROGRAMS = \
- gcell_qa
-
-gcell_qa_SOURCES = gc_spu_procs.c
+ gc_main.c
Deleted: gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_delay.h
Copied: gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_main.c (from rev 7919, gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gcell_spu_main.c)
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_main.c (rev 0)
+++ gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_main.c 2008-03-04 15:51:02 UTC (rev 7921)
@@ -0,0 +1,652 @@
+/* -*- c++ -*- */
+/*
+ * Copyright 2007,2008 Free Software Foundation, Inc.
+ *
+ * This file is part of GNU Radio
+ *
+ * GNU Radio is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 3, or (at your option)
+ * any later version.
+ *
+ * GNU Radio is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+
+#include <spu_intrinsics.h>
+#include <spu_mfcio.h>
+#include <sync_utils.h>
+#include "gc_spu_config.h"
+#include "gc_spu_args.h"
+#include "gc_job_desc.h"
+#include "gc_mbox.h"
+#include "gc_jd_queue.h"
+#include "gc_delay.h"
+#include "gc_declare_proc.h"
+#include "spu_buffers.h"
+#include <string.h>
+#include <assert.h>
+// #include <stdio.h>
+
+
+#define MIN(a,b) ((a) < (b) ? (a) : (b))
+#define MAX(a,b) ((a) > (b) ? (a) : (b))
+
+//! round x down to p2 boundary (p2 must be a power-of-2)
+#define ROUND_DN(x, p2) ((x) & ~((p2)-1))
+
+//! round x up to p2 boundary (p2 must be a power-of-2)
+#define ROUND_UP(x, p2) (((x)+((p2)-1)) & ~((p2)-1))
+
+
+#define USE_LLR_LOST_EVENT 0 // define to 0 or 1
+
+int gc_sys_tag; // tag for misc DMA operations
+static gc_spu_args_t spu_args;
+
+static struct gc_proc_def *gc_proc_def; // procedure entry points
+
+// ------------------------------------------------------------------------
+
+// state for DMA'ing arguments in and out
+
+static int get_tag; // 1 tag for job arg gets
+static int put_tags; // 2 tags for job arg puts
+
+static int pbi = 0; // current put buffer index (0 or 1)
+
+// bitmask (bit per put buffer): bit is set if DMA is started but not complete
+static int put_in_progress = 0;
+#define PBI_MASK(_pbi_) (1 << (_pbi_))
+
+// ------------------------------------------------------------------------
+
+// our working copy of the completion info
+static gc_comp_info_t comp_info = {
+ .in_use = 1,
+ .ncomplete = 0
+};
+
+static int ci_idx = 0; // index of current comp_info
+static int ci_tags; // two consecutive dma tags
+
+// ------------------------------------------------------------------------
+
+/*
+ * Wait until EA copy of comp_info[idx].in_use is 0
+ */
+static void
+wait_for_ppe_to_be_done_with_comp_info(int idx)
+{
+ char _tmp[256];
+ char *buf = (char *) ALIGN(_tmp, 128); // get cache-aligned buffer
+ gc_comp_info_t *p = (gc_comp_info_t *) buf;
+
+ do {
+ mfc_getllar(buf, spu_args.comp_info[idx], 0, 0);
+ spu_readch(MFC_RdAtomicStat);
+ if (p->in_use == 0)
+ return;
+
+ gc_udelay(5); // FIXME use the "lock-line reservation lost" event
+
+ } while (1);
+}
+
+static void
+flush_completion_info(void)
+{
+ if (comp_info.ncomplete == 0)
+ return;
+
+ // ensure that PPE is done with the buffer we're about to overwrite
+ wait_for_ppe_to_be_done_with_comp_info(ci_idx);
+
+ // dma the comp_info out to PPE
+ int tag = ci_tags + ci_idx;
+ mfc_put(&comp_info, spu_args.comp_info[ci_idx], sizeof(gc_comp_info_t), tag, 0, 0);
+
+ // we need to wait for the completion info to finish, as well as
+ // any EA argument puts.
+
+ int tag_mask = 1 << tag; // the comp_info tag
+ if (put_in_progress & PBI_MASK(0))
+ tag_mask |= (1 << (put_tags + 0));
+ if (put_in_progress & PBI_MASK(1))
+ tag_mask |= (1 << (put_tags + 1));
+
+ mfc_write_tag_mask(tag_mask); // the tags we're interested in
+ mfc_read_tag_status_all(); // wait for DMA to complete
+ put_in_progress = 0; // mark them all complete
+
+ // send PPE a message
+ spu_writech(SPU_WrOutIntrMbox, MK_MBOX_MSG(OP_JOBS_DONE, ci_idx));
+
+ ci_idx ^= 0x1; // switch buffers
+ comp_info.in_use = 1;
+ comp_info.ncomplete = 0;
+}
+
+// ------------------------------------------------------------------------
+
+static unsigned int backoff; // current backoff value in clock cycles
+static unsigned int _backoff_start;
+static unsigned int _backoff_cap;
+
+/*
+ * For 3.2 GHz SPE
+ *
+ * 12 4095 cycles 1.3 us
+ * 13 8191 cycles 2.6 us
+ * 14 16383 cycles 5.1 us
+ * 15 32767 cycles 10.2 us
+ * 16 20.4 us
+ * 17 40.8 us
+ * 18 81.9 us
+ * 19 163.8 us
+ * 20 327.7 us
+ * 21 655.4 us
+ */
+static unsigned char log2_backoff_start[16] = {
+// 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
+// -------------------------------------------------------------
+ 12, 12, 12, 13, 13, 13, 13, 14, 14, 14, 14, 15, 15, 15, 16, 16
+};
+
+static unsigned char log2_backoff_cap[16] = {
+// 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
+// -------------------------------------------------------------
+ 17, 17, 17, 18, 18, 18, 18, 19, 19, 19, 19, 20, 20, 20, 21, 21
+};
+
+static void
+backoff_init(void)
+{
+ _backoff_cap = (1 << (log2_backoff_cap[(spu_args.nspus - 1) & 0xf])) - 1;
+ _backoff_start = (1 << (log2_backoff_start[(spu_args.nspus - 1) & 0xf])) - 1;
+
+ backoff = _backoff_start;
+}
+
+static void
+backoff_reset(void)
+{
+ backoff = _backoff_start;
+}
+
+static void
+backoff_delay(void)
+{
+ gc_cdelay(backoff);
+
+ // capped exponential backoff
+ backoff = ((backoff << 1) + 1) & _backoff_cap;
+}
+
+// ------------------------------------------------------------------------
+
+static inline unsigned int
+make_mask(int nbits)
+{
+ return ~(~0 << nbits);
+}
+
+static unsigned int dc_work;
+static int dc_put_tag;
+static unsigned char *dc_ls_base;
+static gc_eaddr_t dc_ea_base;
+
+// divide and conquer
+static void
+d_and_c(unsigned int offset, unsigned int len)
+{
+ unsigned int mask = make_mask(len) << offset;
+ unsigned int t = mask & dc_work;
+ if (t == 0) // nothing to do
+ return;
+ if (t == mask){ // got a match, generate dma
+ mfc_put(dc_ls_base + offset, dc_ea_base + offset, len, dc_put_tag, 0, 0);
+ }
+ else {
+ len >>= 1;
+ d_and_c(offset, len);
+ d_and_c(offset + len, len);
+ }
+}
+
+// Handle the nasty case of a dma xfer that's less than 16 bytes long.
+// len is guaranteed to be in [1, 15]
+
+static void
+handle_slow_and_tedious_dma(gc_eaddr_t ea, unsigned char *ls,
+ unsigned int len, int put_tag)
+{
+ // special case two likely cases, otherwise handle with divide and conquer
+ unsigned int t = (((uintptr_t) ls) | len) & 0x7;
+ if (1 && t == 0){ // 8 byte aligned and len is multiple of 8
+ mfc_put(ls, ea, 8, put_tag, 0, 0);
+ }
+ else if (1 && t == 4){ // 4-byte aligned and len is a multiple of 4
+ switch (len){
+ case 12:
+ mfc_put(ls + 8, ea + 8, 4, put_tag, 0, 0);
+ case 8:
+ mfc_put(ls + 4, ea + 4, 4, put_tag, 0, 0);
+ case 4:
+ mfc_put(ls + 0, ea + 0, 4, put_tag, 0, 0);
+ break;
+ }
+ }
+ else {
+ // General case (divide and conquer)
+ // This code is also correct for the two cases above
+ unsigned int alignment = ((uintptr_t) ls) & 0x7;
+ dc_work = make_mask(len) << alignment;
+ dc_ls_base = (unsigned char *) ROUND_DN((uintptr_t) ls, 8);
+ dc_ea_base = ROUND_DN(ea, (gc_eaddr_t) 8);
+
+ d_and_c( 0, 8);
+ d_and_c( 8, 8);
+ d_and_c(16, 8);
+ }
+}
+
+
+static void
+process_job(gc_eaddr_t jd_ea, gc_job_desc_t *jd)
+{
+ jd->status = JS_OK; // assume success
+
+ if (jd->proc_id >= spu_args.nproc_defs)
+ jd->status = JS_UNKNOWN_PROC;
+
+ else {
+
+ if (jd->eaa.nargs == 0)
+ (*gc_proc_def[jd->proc_id].proc)(&jd->input, &jd->output, &jd->eaa);
+
+ else { // handle EA args that must be DMA'd in/out
+
+ gc_job_ea_args_t *eaa = &jd->eaa;
+
+ int NELMS =
+ MAX(MAX_ARGS_EA,
+ (GC_SPU_BUFSIZE + MFC_MAX_DMA_SIZE - 1) / MFC_MAX_DMA_SIZE);
+
+ mfc_list_element_t dma_get_list[NELMS];
+ //mfc_list_element_t dma_put_list[NELMS];
+
+ memset(dma_get_list, 0, sizeof(dma_get_list));
+ //memset(dma_put_list, 0, sizeof(dma_put_list));
+
+ int gli = 0; // get list index
+ //int pli = 0; // put list index
+
+ unsigned char *get_base = _gci_getbuf[0];
+ unsigned char *get_t = get_base;
+ unsigned int total_get_dma_len = 0;
+
+ unsigned char *put_base = _gci_putbuf[pbi];
+ unsigned char *put_t = put_base;
+ unsigned int total_put_alloc = 0;
+ int put_tag = put_tags + pbi;
+
+ // Do we have any "put" args? If so ensure that previous
+ // dma from this buffer is complete
+
+ if ((jd->sys.direction_union & GCJD_DMA_PUT)
+ && (put_in_progress & PBI_MASK(pbi))){
+
+ mfc_write_tag_mask(1 << put_tag); // the tag we're interested in
+ mfc_read_tag_status_all(); // wait for DMA to complete
+ put_in_progress &= ~(PBI_MASK(pbi));
+ }
+
+
+ // for now, all EA's must have the same high 32-bits
+ gc_eaddr_t common_ea = eaa->arg[0].ea_addr;
+
+
+ // assign LS addresses for buffers
+
+ for (unsigned int i = 0; i < eaa->nargs; i++){
+
+ gc_eaddr_t ea_base = 0;
+ unsigned char *ls_base;
+ int offset;
+ unsigned int dma_len;
+
+ if (eaa->arg[i].direction == GCJD_DMA_GET){
+ ea_base = ROUND_DN(eaa->arg[i].ea_addr, (gc_eaddr_t) CACHE_LINE_SIZE);
+ offset = eaa->arg[i].ea_addr & (CACHE_LINE_SIZE-1);
+ dma_len = ROUND_UP(eaa->arg[i].get_size + offset, CACHE_LINE_SIZE);
+ total_get_dma_len += dma_len;
+
+ if (total_get_dma_len > GC_SPU_BUFSIZE){
+ jd->status = JS_ARGS_TOO_LONG;
+ goto wrap_up;
+ }
+
+ ls_base = get_t;
+ get_t += dma_len;
+ eaa->arg[i].ls_addr = ls_base + offset;
+
+ if (0){
+ assert((mfc_ea2l(eaa->arg[i].ea_addr) & 0x7f) == ((intptr_t)eaa->arg[i].ls_addr & 0x7f));
+ assert((ea_base & 0x7f) == 0);
+ assert(((intptr_t)ls_base & 0x7f) == 0);
+ assert((dma_len & 0x7f) == 0);
+ assert((eaa->arg[i].get_size <= dma_len)
+ && dma_len <= (eaa->arg[i].get_size + offset + CACHE_LINE_SIZE - 1));
+ }
+
+ // add to dma get list
+ // FIXME (someday) the dma lists is where the JS_BAD_EAH limitation comes from
+
+ while (dma_len != 0){
+ int n = MIN(dma_len, MFC_MAX_DMA_SIZE);
+ dma_get_list[gli].size = n;
+ dma_get_list[gli].eal = mfc_ea2l(ea_base);
+ dma_len -= n;
+ ea_base += n;
+ gli++;
+ }
+ }
+
+ else if (eaa->arg[i].direction == GCJD_DMA_PUT){
+ //
+ // This case is trickier than the GET case since we can't
+ // write outside of the bounds of the user provided buffer.
+ // We still align the buffers to 128-bytes for good performance
+ // in the middle portion of the xfers.
+ //
+ ea_base = ROUND_DN(eaa->arg[i].ea_addr, (gc_eaddr_t) CACHE_LINE_SIZE);
+ offset = eaa->arg[i].ea_addr & (CACHE_LINE_SIZE-1);
+
+ uint32_t ls_alloc_len =
+ ROUND_UP(eaa->arg[i].put_size + offset, CACHE_LINE_SIZE);
+
+ total_put_alloc += ls_alloc_len;
+
+ if (total_put_alloc > GC_SPU_BUFSIZE){
+ jd->status = JS_ARGS_TOO_LONG;
+ goto wrap_up;
+ }
+
+ ls_base = put_t;
+ put_t += ls_alloc_len;
+ eaa->arg[i].ls_addr = ls_base + offset;
+
+ if (1){
+ assert((mfc_ea2l(eaa->arg[i].ea_addr) & 0x7f)
+ == ((intptr_t)eaa->arg[i].ls_addr & 0x7f));
+ assert((ea_base & 0x7f) == 0);
+ assert(((intptr_t)ls_base & 0x7f) == 0);
+ }
+ }
+
+ else
+ assert(0);
+ }
+
+ // fire off the dma to fetch the args and wait for it to complete
+ mfc_getl(get_base, common_ea, dma_get_list, gli*sizeof(dma_get_list[0]), get_tag, 0, 0);
+ mfc_write_tag_mask(1 << get_tag); // the tag we're interested in
+ mfc_read_tag_status_all(); // wait for DMA to complete
+
+ // do the work
+ (*gc_proc_def[jd->proc_id].proc)(&jd->input, &jd->output, &jd->eaa);
+
+
+ // Do we have any "put" args? If so copy them out
+ if (jd->sys.direction_union & GCJD_DMA_PUT){
+
+ // Do the copy out using single DMA xfers. The LS ranges
+ // aren't generally contiguous.
+
+ bool started_dma = false;
+
+ for (unsigned int i = 0; i < eaa->nargs; i++){
+ if (eaa->arg[i].direction == GCJD_DMA_PUT && eaa->arg[i].put_size != 0){
+
+ started_dma = true;
+
+ gc_eaddr_t ea;
+ unsigned char *ls;
+ unsigned int len;
+
+ ea = eaa->arg[i].ea_addr;
+ ls = (unsigned char *) eaa->arg[i].ls_addr;
+ len = eaa->arg[i].put_size;
+
+ if (len < 16)
+ handle_slow_and_tedious_dma(ea, ls, len, put_tag);
+
+ else {
+ if ((ea & 0xf) != 0){
+
+ // printf("1: ea = 0x%x len = %5d\n", (int) ea, len);
+
+ // handle the "pre-multiple-of-16" portion
+ // do 1, 2, 4, or 8 byte xfers as required
+
+ if (ea & 0x1){ // do a 1-byte xfer
+ mfc_put(ls, ea, 1, put_tag, 0, 0);
+ ea += 1;
+ ls += 1;
+ len -= 1;
+ }
+ if (ea & 0x2){ // do a 2-byte xfer
+ mfc_put(ls, ea, 2, put_tag, 0, 0);
+ ea += 2;
+ ls += 2;
+ len -= 2;
+ }
+ if (ea & 0x4){ // do a 4-byte xfer
+ mfc_put(ls, ea, 4, put_tag, 0, 0);
+ ea += 4;
+ ls += 4;
+ len -= 4;
+ }
+ if (ea & 0x8){ // do an 8-byte xfer
+ mfc_put(ls, ea, 8, put_tag, 0, 0);
+ ea += 8;
+ ls += 8;
+ len -= 8;
+ }
+ }
+
+ if (1){
+ // printf("2: ea = 0x%x len = %5d\n", (int) ea, len);
+ assert((ea & 0xf) == 0);
+ assert((((intptr_t) ls) & 0xf) == 0);
+ }
+
+ // handle the "multiple-of-16" portion
+
+ int aligned_len = ROUND_DN(len, 16);
+ len = len & (16 - 1);
+
+ while (aligned_len != 0){
+ int dma_len = MIN(aligned_len, MFC_MAX_DMA_SIZE);
+ mfc_put(ls, ea, dma_len, put_tag, 0, 0);
+ ea += dma_len;
+ ls += dma_len;
+ aligned_len -= dma_len;
+ }
+
+ if (1){
+ // printf("3: ea = 0x%x len = %5d\n", (int)ea, len);
+ assert((ea & 0xf) == 0);
+ assert((((intptr_t) ls) & 0xf) == 0);
+ }
+
+ // handle "post-multiple-of-16" portion
+
+ if (len != 0){
+
+ if (len >= 8){ // do an 8-byte xfer
+ mfc_put(ls, ea, 8, put_tag, 0, 0);
+ ea += 8;
+ ls += 8;
+ len -= 8;
+ }
+ if (len >= 4){ // do a 4-byte xfer
+ mfc_put(ls, ea, 4, put_tag, 0, 0);
+ ea += 4;
+ ls += 4;
+ len -= 4;
+ }
+ if (len >= 2){ // do a 2-byte xfer
+ mfc_put(ls, ea, 2, put_tag, 0, 0);
+ ea += 2;
+ ls += 2;
+ len -= 2;
+ }
+ if (len >= 1){ // do a 1-byte xfer
+ mfc_put(ls, ea, 1, put_tag, 0, 0);
+ ea += 1;
+ ls += 1;
+ len -= 1;
+ }
+ if (1)
+ assert(len == 0);
+ }
+ }
+ }
+ }
+ if (started_dma){
+ put_in_progress |= PBI_MASK(pbi); // note it's running
+ pbi ^= 1; // toggle current buffer
+ }
+ }
+ }
+ }
+
+ wrap_up:; // semicolon creates null statement for C99 compliance
+
+ // Copy job descriptor back out to EA.
+ // (The dma will be waited on in flush_completion_info)
+ int tag = ci_tags + ci_idx; // use the current completion tag
+ mfc_put(jd, jd_ea, sizeof(*jd), tag, 0, 0);
+
+
+ // Tell PPE we're done with the job.
+ //
+ // We queue these up until we run out of room, or until we can send
+ // the info to the PPE w/o blocking. The blocking check is in
+ // main_loop
+
+ comp_info.job_id[comp_info.ncomplete++] = jd->sys.job_id;
+
+ if (comp_info.ncomplete == GC_CI_NJOBS)
+ flush_completion_info();
+}
+
+static void
+main_loop(void)
+{
+ static gc_job_desc_t jd; // static gets us proper alignment
+ gc_eaddr_t jd_ea;
+
+ // setup events
+ spu_writech(SPU_WrEventMask, MFC_LLR_LOST_EVENT);
+ gc_jd_queue_getllar(spu_args.queue); // get a line reservation on the queue
+
+ while (1){
+
+#if (USE_LLR_LOST_EVENT)
+
+ if (unlikely(spu_readchcnt(SPU_RdEventStat))){
+ //
+ // execute standard event handling prologue
+ //
+ int status = spu_readch(SPU_RdEventStat);
+ int mask = spu_readch(SPU_RdEventMask);
+ spu_writech(SPU_WrEventMask, mask & ~status); // disable active events
+ spu_writech(SPU_WrEventAck, status); // ack active events
+
+ // execute per-event actions
+
+ if (status & MFC_LLR_LOST_EVENT){
+ //
+ // We've lost a line reservation. This is most likely caused
+ // by somebody doing something to the queue. Go look and see
+ // if there's anything for us.
+ //
+ if (gc_jd_queue_dequeue(spu_args.queue, &jd_ea, &jd))
+ process_job(jd_ea, &jd);
+
+ gc_jd_queue_getllar(spu_args.queue); // get a new reservation
+ }
+
+ //
+ // execute standard event handling epilogue
+ //
+ spu_writech(SPU_WrEventMask, mask); // restore event mask
+ }
+
+#else
+
+ // try to get a job from the job queue
+ if (gc_jd_queue_dequeue(spu_args.queue, &jd_ea, &jd)){
+ process_job(jd_ea, &jd);
+ backoff_reset();
+ }
+ else
+ backoff_delay();
+
+#endif
+
+ // any msgs for us?
+
+ if (unlikely(spu_readchcnt(SPU_RdInMbox))){
+ int msg = spu_readch(SPU_RdInMbox);
+ // printf("spu[%d] mbox_msg: 0x%08x\n", spu_args.spu_idx, msg);
+ if (MBOX_MSG_OP(msg) == OP_EXIT){
+ flush_completion_info();
+ return;
+ }
+ if (MBOX_MSG_OP(msg) == OP_GET_SPU_BUFSIZE){
+ spu_writech(SPU_WrOutIntrMbox, MK_MBOX_MSG(OP_SPU_BUFSIZE, GC_SPU_BUFSIZE_BASE));
+ }
+ }
+
+ // If we've got job completion info for the PPE and we can send a
+ // message without blocking, do it.
+
+ if (comp_info.ncomplete != 0 && spu_readchcnt(SPU_WrOutIntrMbox) != 0)
+ flush_completion_info();
+ }
+}
+
+
+int
+main(unsigned long long spe_id __attribute__((unused)),
+ unsigned long long argp,
+ unsigned long long envp __attribute__((unused)))
+{
+ gc_sys_tag = mfc_tag_reserve(); // allocate a tag for our misc DMA operations
+ ci_tags = mfc_multi_tag_reserve(2);
+ put_tags = mfc_multi_tag_reserve(2);
+ get_tag = mfc_tag_reserve();
+
+ // dma the args in
+ mfc_get(&spu_args, argp, sizeof(spu_args), gc_sys_tag, 0, 0);
+ mfc_write_tag_mask(1 << gc_sys_tag); // the tag we're interested in
+ mfc_read_tag_status_all(); // wait for DMA to complete
+
+ // initialize pointer to procedure entry table
+ gc_proc_def = (gc_proc_def_t *) spu_args.proc_def_ls_addr;
+
+ backoff_init(); // initialize backoff parameters
+
+ main_loop();
+ return 0;
+}
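Note on handle_slow_and_tedious_dma in gc_main.c above: the general case marks the bytes to be written as bits in dc_work over a 24-byte window, then lets d_and_c recursively halve each 8-byte chunk until a piece is either fully covered (one transfer) or empty. The sketch below replays that recursion on the host and logs the transfers instead of calling mfc_put. The names sim_xfer_t, sim_log, sim_d_and_c and plan_small_put are hypothetical and exist only for this illustration; nothing here is part of the commit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one simulated transfer; stands in for an mfc_put call. */
typedef struct {
  unsigned offset;   /* byte offset from the 8-byte-aligned window base */
  unsigned len;      /* transfer length in bytes: 1, 2, 4 or 8 */
} sim_xfer_t;

static sim_xfer_t sim_log[24];  /* worst case: one transfer per window byte */
static size_t     sim_count;
static unsigned   sim_work;     /* bit i set => byte i of the window must be written */

static unsigned make_mask(unsigned nbits)
{
  return ~(~0u << nbits);
}

/* Same recursion as d_and_c in the diff, but logging instead of issuing DMA. */
static void sim_d_and_c(unsigned offset, unsigned len)
{
  unsigned mask = make_mask(len) << offset;
  unsigned t = mask & sim_work;

  if (t == 0)                   /* no wanted bytes in this piece */
    return;

  if (t == mask) {              /* piece fully wanted: one aligned transfer */
    assert(sim_count < sizeof(sim_log) / sizeof(sim_log[0]));
    sim_log[sim_count].offset = offset;
    sim_log[sim_count].len = len;
    sim_count++;
    return;
  }

  len >>= 1;                    /* partially wanted: split in half and recurse */
  sim_d_and_c(offset, len);
  sim_d_and_c(offset + len, len);
}

/* Plan the transfers for len bytes (1..15) that start 'alignment' bytes
 * (0..7) past an 8-byte boundary.  Returns the number of transfers logged. */
static size_t plan_small_put(unsigned alignment, unsigned len)
{
  assert(alignment < 8 && len >= 1 && len <= 15);
  sim_work = make_mask(len) << alignment;
  sim_count = 0;
  sim_d_and_c(0, 8);
  sim_d_and_c(8, 8);
  sim_d_and_c(16, 8);
  return sim_count;
}
```

For example, plan_small_put(3, 7) logs three transfers: 1 byte at offset 3, 4 bytes at offset 4, and 2 bytes at offset 8. Each transfer is a power-of-two size at an offset that is a multiple of that size, which is the property the recursion is there to guarantee.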
Deleted: gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_spu_procs.c
Deleted: gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gcell_spu_main.c
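Note on the put path of process_job in gc_main.c above: for arguments of 16 bytes or more, the copy-out is issued in three phases: small transfers (1, 2, 4, 8 bytes) until the effective address is 16-byte aligned, then 16-byte multiples capped at the maximum DMA size, then small transfers for the remainder. The following host-side sketch reproduces only that address arithmetic. xfer_t, emit and split_put are hypothetical names, MAX_DMA is an assumed stand-in for MFC_MAX_DMA_SIZE, and the local-store pointer (which the real code advances in lockstep with the effective address) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_DMA 16384u   /* assumed stand-in for MFC_MAX_DMA_SIZE */

/* Hypothetical record of one planned transfer. */
typedef struct {
  uint64_t ea;    /* effective address of the transfer */
  unsigned len;   /* length in bytes */
} xfer_t;

static size_t emit(xfer_t *out, size_t n, size_t cap, uint64_t ea, unsigned len)
{
  assert(n < cap);
  out[n].ea = ea;
  out[n].len = len;
  return n + 1;
}

/* Split a put of len bytes (len >= 16) at ea into naturally aligned
 * transfers, mirroring the head / body / tail phases in process_job.
 * Returns the number of transfers written to out. */
static size_t split_put(uint64_t ea, unsigned len, xfer_t *out, size_t cap)
{
  size_t n = 0;
  unsigned sz;

  assert(len >= 16);

  /* head: 1-, 2-, 4-, 8-byte transfers until ea is 16-byte aligned */
  for (sz = 1; sz <= 8; sz <<= 1) {
    if (ea & sz) {
      n = emit(out, n, cap, ea, sz);
      ea += sz;
      len -= sz;
    }
  }

  /* body: multiples of 16 bytes, at most MAX_DMA per transfer */
  unsigned aligned_len = len & ~15u;
  len &= 15u;
  while (aligned_len != 0) {
    unsigned chunk = aligned_len < MAX_DMA ? aligned_len : MAX_DMA;
    n = emit(out, n, cap, ea, chunk);
    ea += chunk;
    aligned_len -= chunk;
  }

  /* tail: 8-, 4-, 2-, 1-byte transfers for the remaining 0..15 bytes */
  for (sz = 8; sz >= 1; sz >>= 1) {
    if (len >= sz) {
      n = emit(out, n, cap, ea, sz);
      ea += sz;
      len -= sz;
    }
  }

  assert(len == 0);
  return n;
}
```

As a worked case, a 40-byte put at address 0x1003 becomes seven transfers of 1, 4, 8, 16, 8, 2 and 1 bytes, at 0x1003, 0x1004, 0x1008, 0x1010, 0x1020, 0x1028 and 0x102a.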
Copied: gnuradio/branches/developers/eb/gcell-wip/src/spu-include/gc_delay.h (from rev 7919, gnuradio/branches/developers/eb/gcell-wip/src/lib/spu/gc_delay.h)
===================================================================
--- gnuradio/branches/developers/eb/gcell-wip/src/spu-include/gc_delay.h (rev 0)
+++ gnuradio/branches/developers/eb/gcell-wip/src/spu-include/gc_delay.h 2008-03-04 15:51:02 UTC (rev 7921)
@@ -0,0 +1,27 @@
+/* -*- c++ -*- */
+/*
+ * Copyright 2007,2008 Free Software Foundation, Inc.
+ *
+ * This file is part of GNU Radio
+ *
+ * GNU Radio is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 3, or (at your option)
+ * any later version.
+ *
+ * GNU Radio is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, write to the Free Software Foundation, Inc.,
+ * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+ */
+#ifndef INCLUDED_GC_DELAY_H
+#define INCLUDED_GC_DELAY_H
+
+void gc_udelay(unsigned int usecs);
+void gc_cdelay(unsigned int cpu_cycles);
+
+#endif /* INCLUDED_GC_DELAY_H */
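Note on gc_cdelay as used by backoff_delay in gc_main.c above: the backoff value is always of the form 2^k - 1, so shifting left, adding one and masking with the cap doubles the wait until it saturates. The sketch below models just that arithmetic on the host. The sim_ prefixed functions and cycles_to_usecs are hypothetical helpers for this illustration, and the log2 values 12 and 17 are the single-SPE entries from the tables in the diff.

```c
#include <assert.h>

static unsigned int backoff;         /* current wait, in clock cycles */
static unsigned int backoff_start;   /* 2^log2_start - 1 */
static unsigned int backoff_cap;     /* 2^log2_cap - 1 */

static void sim_backoff_init(unsigned log2_start, unsigned log2_cap)
{
  assert(log2_start <= log2_cap && log2_cap < 32);
  backoff_start = (1u << log2_start) - 1;
  backoff_cap = (1u << log2_cap) - 1;
  backoff = backoff_start;
}

static void sim_backoff_reset(void)
{
  backoff = backoff_start;
}

/* Returns the cycles to wait now (what the real code passes to gc_cdelay),
 * then doubles the next wait, saturating at the cap. */
static unsigned int sim_backoff_next(void)
{
  unsigned int cycles = backoff;
  backoff = ((backoff << 1) + 1) & backoff_cap;
  return cycles;
}

/* Convert a cycle count to microseconds for a given clock rate in Hz. */
static double cycles_to_usecs(unsigned int cycles, double clock_hz)
{
  return cycles * 1e6 / clock_hz;
}
```

With sim_backoff_init(12, 17), successive calls to sim_backoff_next return 4095, 8191, 16383, 32767, 65535, 131071 and then stay at 131071 until sim_backoff_reset is called. At 3.2 GHz the first value corresponds to roughly 1.3 us, in line with the table in the source comment.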